Search Engine Robots - How They Work, What They Do (Part II)
by Daria Goetsch

Why Isn't My Website In The Search Engine? 


If your site isn't found in the search engines, it is probably 
because the robots couldn't deal with it. The cause could be as 
simple as the robots not being able to find the site, or it could 
be a more complicated issue: the robots cannot crawl the site or 
cannot figure out what your pages are about.

Submitting your site to the major search engines will help 
with the "can't find it" problem. Even having links pointing 
back to your site can be enough to attract the search engine 
robots. Google, for example, suggests that you may not have to
submit your pages; it will find your site if you have a link 
pointing back to it from at least one other site on the web.

If the robots can find your site but can't make sense of it, 
then you may need to look at the content and technology used on
your pages. Frames, Flash, dynamically generated pages, and 
invalid HTML source code can cause problems when the search 
engine robot tries to access your web pages. While some search 
engines are beginning to be able to index dynamically generated 
pages and Flash (e.g. Google and AllTheWeb), use of some of these 
technologies can hinder your ability to be indexed by the search 
engine robots.

Text in images cannot be read by the search engine robots. Using 
ALT image text is an important way to help the robots "read" your 
images. Websites with extensive images rely heavily on ALT text 
to present their content.
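
One way to check your own pages for this is to list every image 
that has no ALT text. The following is a minimal sketch in Python, 
using only the standard library; the sample page, the file names, 
and the names AltTextAuditor and audit_alt_text are made up for 
illustration.

```python
from html.parser import HTMLParser


class AltTextAuditor(HTMLParser):
    """Collects <img> tags and records whether each has ALT text."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []  # src of images with no usable ALT text
        self.with_alt = []     # (src, alt) pairs for images with ALT text

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        # A missing or empty alt attribute gives the robot nothing to read.
        alt = (attributes.get("alt") or "").strip()
        if alt:
            self.with_alt.append((src, alt))
        else:
            self.missing_alt.append(src)


def audit_alt_text(html):
    """Parse an HTML string and return the filled-in auditor."""
    auditor = AltTextAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor


# Hypothetical sample page, for illustration only.
SAMPLE_PAGE = """
<html><body>
<img src="logo.gif" alt="Acme Widgets logo">
<img src="banner.gif">
<img src="spacer.gif" alt="">
</body></html>
"""

report = audit_alt_text(SAMPLE_PAGE)
print("Images with ALT text:", report.with_alt)
print("Images missing ALT text:", report.missing_alt)
```

Any image listed as missing ALT text is one the robots will most 
likely pass over without reading anything.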

How Do I Get The Most Out Of Indexing?

If you know what to "feed" the spidering robots, you will help 
yourself with search engine ranking. 

Having a website full of good content is the major factor. Search 
engines exist to serve their visitors, not to rank your website. 
You need to be sure to present your site in a way that will be 
most useful to the search engine visitor. Each search engine has 
its own idea of what is important in a page, but they all value 
text highly. Making sure that the text on your pages includes 
your most important keyword phrases will help the search engine 
evaluate the content of those pages. 

Making sure that you have good title and meta tags will further 
assist the search engines in understanding what your page is 
about. If the text on the page is about widgets, the title is 
about widgets, and the meta tags are about widgets, the search
engine will have a pretty good idea that you are all about 
widgets. When their visitors search for widgets, the search 
engines know to list your site in the results.
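
To see whether a page is consistently "about widgets", you can 
check where a keyword phrase actually appears. The sketch below 
is written in Python with the standard library only. It is a 
simple presence check, not how any search engine scores a page, 
and the sample page and the names PageSummary and keyword_report 
are assumptions made for the example.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Gathers the title, named meta tags, and remaining page text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}         # lower-cased meta name -> content
        self.body_text = []    # text found outside the title
        self._in_title = False
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name"):
            name = attributes["name"].lower()
            self.meta[name] = attributes.get("content") or ""
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth:
            self.body_text.append(data)


def keyword_report(html, phrase):
    """Report where the phrase appears (case-insensitive)."""
    page = PageSummary()
    page.feed(html)
    page.close()
    phrase = phrase.lower()
    # Collapse runs of whitespace so line breaks don't split the phrase.
    body = " ".join(" ".join(page.body_text).split()).lower()
    return {
        "title": phrase in page.title.lower(),
        "meta description": phrase in page.meta.get("description", "").lower(),
        "meta keywords": phrase in page.meta.get("keywords", "").lower(),
        "body text": phrase in body,
    }


# Hypothetical sample page, for illustration only.
SAMPLE_PAGE = """
<html><head>
<title>Blue Widgets from Acme</title>
<meta name="description" content="Hand-made blue widgets.">
<meta name="keywords" content="gadgets, gizmos">
</head><body>
<h1>Blue Widgets</h1>
<p>Our blue widgets ship worldwide.</p>
</body></html>
"""

report = keyword_report(SAMPLE_PAGE, "blue widgets")
for place, found in report.items():
    print(place, "->", "found" if found else "MISSING")
```

In this sample the phrase is missing only from the meta keywords 
tag, which is the kind of mismatch worth fixing.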

A sitemap page is a very good way of giving the search engine 
robot every opportunity to reach your website pages. Since 
robots follow the links on your web pages, make sure that 
at least your most important pages are included in the sitemap; 
you may even want to include all your pages there, depending on 
the size of your site. Be sure to add a link to the sitemap page 
from each page on your site.
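
Both halves of that advice can be checked mechanically once you 
have a list of which page links to which. Here is a minimal Python 
sketch; the page names, the links table, and the function name 
sitemap_gaps are hypothetical, and in practice you would build the 
table from your own site.

```python
def sitemap_gaps(links, sitemap="sitemap"):
    """Find pages the sitemap omits and pages that don't link to it.

    links maps each page name to the list of pages it links to.
    """
    pages = set(links) - {sitemap}
    listed = set(links.get(sitemap, []))
    not_listed = sorted(pages - listed)
    no_link_back = sorted(p for p in pages if sitemap not in links[p])
    return not_listed, no_link_back


# Hypothetical site structure, for illustration only.
SITE_LINKS = {
    "home": ["about us", "sitemap"],
    "about us": ["products", "sitemap"],
    "products": ["widgets"],
    "widgets": ["sitemap"],
    "sitemap": ["home", "about us", "products"],
}

not_listed, no_link_back = sitemap_gaps(SITE_LINKS)
print("Pages missing from the sitemap:", not_listed)
print("Pages with no link to the sitemap:", no_link_back)
```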

Another important consideration is that of keeping all of your 
pages within a small number of "clicks" from your top page. Many 
robots will not follow links more than two or three levels deep, 
so if your "widgets" page can only be reached from your home page 
by following multiple links (e.g. home page >> about us page >> 
products page >> widgets page), the robot may not crawl deep 
enough to get to the widgets page. 
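
Click depth is easy to measure with a breadth-first search over 
your site's links. The Python sketch below uses the example path 
above; the links table, the two-click limit, and the function 
names are assumptions chosen for illustration, since each robot 
sets its own limit.

```python
from collections import deque


def click_depths(links, home):
    """Return the fewest clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_too_deep(links, home, max_clicks=2):
    """Return pages deeper than max_clicks, plus unreachable pages."""
    depths = click_depths(links, home)
    all_pages = set(links)
    for targets in links.values():
        all_pages.update(targets)
    too_deep = {p: d for p, d in depths.items() if d > max_clicks}
    unreachable = sorted(all_pages - set(depths))
    return too_deep, unreachable


# The example path from the text: each page links only to the next.
SITE_LINKS = {
    "home": ["about us"],
    "about us": ["products"],
    "products": ["widgets"],
    "widgets": [],
}

too_deep, unreachable = pages_too_deep(SITE_LINKS, "home")
print("Too deep without a sitemap:", too_deep)

# The same site with a sitemap page linked from the home page.
WITH_SITEMAP = dict(SITE_LINKS)
WITH_SITEMAP["home"] = ["about us", "sitemap"]
WITH_SITEMAP["sitemap"] = ["about us", "products", "widgets"]

too_deep_after, _ = pages_too_deep(WITH_SITEMAP, "home")
print("Too deep with a sitemap:", too_deep_after)
```

With the sitemap in place, the widgets page moves from three 
clicks away to two, which is why a sitemap linked from the home 
page helps with deep pages.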

Testing Your Website For Search Engine Robot Accessibility

To get an idea just what the search engine robot "sees" on your 
page, you can look at the Sim Spider tool. You may be surprised 
at how different your site looks to the robot. You can find this 
tool at http://www.searchengineworld.com/cgi-bin/sim_spider.cgi

You will see text and ALT image text show up in the results. If 
your entire website is built in Flash, you will see nothing at 
all because most robots cannot read the contents of Flash movies. 
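
If you want a rough idea of the same thing offline, the following 
Python sketch pulls out only the page text and ALT image text. It 
is not the Sim Spider's own code, just an approximation under 
simple assumptions; the class name SpiderView, the function 
spider_view, and both sample pages are made up for the example.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Keeps page text and ALT text; drops script and style content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            alt = (dict(attrs).get("alt") or "").strip()
            if alt:
                self.chunks.append(alt)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self._skip_depth:
            self.chunks.append(text)


def spider_view(html):
    """Return the list of text chunks a simple robot could read."""
    viewer = SpiderView()
    viewer.feed(html)
    viewer.close()
    return viewer.chunks


# Hypothetical sample pages, for illustration only.
TEXT_PAGE = """
<html><head><title>Acme Widgets</title>
<style>h1 { color: blue; }</style></head>
<body>
<h1>Widgets</h1>
<img src="widget.gif" alt="A blue widget">
<p>We sell widgets.</p>
</body></html>
"""

FLASH_PAGE = """
<html><body>
<object type="application/x-shockwave-flash" data="site.swf">
<param name="movie" value="site.swf">
</object>
</body></html>
"""

print("Text page:", spider_view(TEXT_PAGE))
print("Flash-only page:", spider_view(FLASH_PAGE))
```

The Flash-only page produces an empty list: there is no text or 
ALT text in the HTML for a robot to pick up.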

The Bottom Line

When it comes to search engine robots, think simply. Lots of 
good content and text, hyperlinks the robots can follow,
optimization of your pages, topical links pointing back to your 
site, and a sitemap will help ensure the best results when the
robots come visiting.

Resources

SpiderSpotting - Search Engine Watch 
http://searchenginewatch.com/webmasters/spiders.html

Robotstxt.org 
List of robots and protocols for setting up a robots.txt file. 
http://www.robotstxt.org/

Spider-Food 
Tutorials, forums and articles about Search Engine spiders and 
Search Engine Marketing. 
http://spider-food.net/

Spiderhunter.com 
Articles and resources about tracking Search Engine spiders. 
http://www.spiderhunter.com/

Sim Spider Search Engine Robot Simulator 
Search Engine World has a spider that simulates what the Search 
Engine robots read from your website. 
http://www.searchengineworld.com/cgi-bin/sim_spider.cgi 


================================================================
Daria Goetsch is the founder and Search Engine Marketing Consultant 
for Search Innovation Marketing (www.searchinnovation.com), a 
Search Engine Promotion company serving small businesses. Besides 
running her own company, Daria is an associate of WebMama.com, 
an Internet web marketing strategies company. She has specialized 
in search engine optimization since 1998, including three years 
as the Search Engine Specialist for O'Reilly & Associates, a 
technical book publishing company.
================================================================